Code
setwd("..your working directory")In this lab we will explore some aspects of hyperspectral remote sensing obtained from the National Ecological Observatory Network (NEON) with the goal of predicting biodiversity from the sky. Specifically, we will use hyperspectral remote sensing data from the NEON Airborne Observation Platform (AOP) (more information here and here). As you will see, working with hyperspectral information/data is similar to work with any other information/data (e.g., species abundance, presence-absence) and consequently it can be used to calculate any metric of biodiversity. In this sense, spectral diversity can be considered as a dimension of biodiversity.
Note. Part of the text used in this tutorial was extracted from here with several modifications.
Set up a working directory and store the data files in that directory. Tell R that this is the directory you will be using, and read in your data:
setwd("..your working directory")To do this laboratory you will need to install the following packages:
# Package vector names
packages <- c("neonUtilities", "BiocManager", "plyr", "reshape2",
"terra", "sf", "sp", "dismo") You can use the function install.packages() to install the packages.
If you don’t want to install the packages one by one, you can use the next command.
# Install packages not yet installed
installed_packages <- packages %in% rownames(installed.packages())
if (any(installed_packages == FALSE)) {
install.packages(packages[!installed_packages], dependencies = TRUE)
}The package {rhdf5} is not available on CRAN, but it is located in “Bioconductor”.
When installing the package {rhdf5} R will ask you if you want to “Update all/some/none? [a/s/n]:” please in your console type n.
if ( ! ("rhdf5" %in% installed.packages())) {BiocManager::install("rhdf5")}Call or load all packages
sapply(packages, require, character.only = TRUE)Loading required package: neonUtilities
Loading required package: BiocManager
Loading required package: plyr
Loading required package: reshape2
Loading required package: terra
terra 1.7.39
Loading required package: sf
Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE
Loading required package: sp
The legacy packages maptools, rgdal, and rgeos, underpinning the sp package,
which was just loaded, will retire in October 2023.
Please refer to R-spatial evolution reports for details, especially
https://r-spatial.org/r/2023/05/15/evolution4.html.
It may be desirable to make the sf package available;
package maintainers should consider adding sf to Suggests:.
The sp package is now running under evolution status 2
(status 2 uses the sf package in place of rgdal)
Loading required package: dismo
Loading required package: raster
neonUtilities BiocManager plyr reshape2 terra
TRUE TRUE TRUE TRUE TRUE
sf sp dismo
TRUE TRUE TRUE
library(rhdf5)
library(tidyverse)── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.4
✔ forcats 1.0.0 ✔ stringr 1.5.0
✔ ggplot2 3.4.4 ✔ tibble 3.2.1
✔ lubridate 1.9.2 ✔ tidyr 1.3.0
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::arrange() masks plyr::arrange()
✖ purrr::compact() masks plyr::compact()
✖ dplyr::count() masks plyr::count()
✖ dplyr::desc() masks plyr::desc()
✖ tidyr::extract() masks raster::extract(), terra::extract()
✖ dplyr::failwith() masks plyr::failwith()
✖ dplyr::filter() masks stats::filter()
✖ dplyr::id() masks plyr::id()
✖ dplyr::lag() masks stats::lag()
✖ dplyr::mutate() masks plyr::mutate()
✖ dplyr::rename() masks plyr::rename()
✖ dplyr::select() masks raster::select()
✖ dplyr::summarise() masks plyr::summarise()
✖ dplyr::summarize() masks plyr::summarize()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Double-check your working directory.
You can use the function getwd() to get the current working directory.
dir.create("data")Warning in dir.create("data"): 'data' already exists
dir.create("data/NEON")Warning in dir.create("data/NEON"): 'data/NEON' already exists
The first step is to prepare the information required to download the hyperspectral data from NEON-AOP. To do that we will first download the Terrestrial Observation System Sampling Locations dataset. You can download it from NEON Spatial Data & Maps, specifically under the Terrestrial Observation System Sampling Locations tab or by clicking HERE. Once you have downloaded the data, please store it within the folder data.
NEON_plots <- st_read("data/All_NEON_TOS_Plots_V10/All_NEON_TOS_Plot_Centroids_V10.shp")Reading layer `All_NEON_TOS_Plot_Centroids_V10' from data source
`/Users/jpintole/Documents/GitHub/BiodiversityScience/Spring2024/Lab_6_Spectra/data/All_NEON_TOS_Plots_V10/All_NEON_TOS_Plot_Centroids_V10.shp'
using driver `ESRI Shapefile'
Simple feature collection with 3842 features and 36 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: -156.651 ymin: 17.95158 xmax: -66.82463 ymax: 71.31671
Geodetic CRS: GCS_WGS_84_with_axis_order_normalized_for_visualization
From the spatial distribution of the NEON sites/plots we loaded into R, we will select the NEON site Harvard Forest & Quabbin Watershed (HARV).
HARV_plots <- NEON_plots %>%
filter(siteID == "HARV" & plotType == "distributed" & subtype == "basePlot") %>%
arrange(plotID)Explore the data
view(HARV_plots)Now from the NEON site HARV we will select the plot number one (plotID = HARV_001) and extract the coordinates in UTM (Universal Transverse Mercator) system as spatial reference to download the hyperspectral data from the NEON-AOP.
east <- HARV_plots$easting
names(east) <- HARV_plots$plotID
north <- HARV_plots$northing
names(north) <- HARV_plots$plotID
coords_HARV_001 <- c(east[1], north[1])
coords_HARV_001 HARV_001 HARV_001
725555.7 4700523.6
Now we have the UTM coordinates from the plot HARV_001 we can download the Hyperspectral Remote Sensing Data in HDF5 Format for that site. You can see the AOP data available HERE.
You can download the spectral data directly from HERE. Once you download the zip file, copy it into the folder data/NEON.
You can download the data using the the function “byTileAOP()” from the R package {neonUtilities}
byTileAOP(dpID = "DP3.30006.001", # NEON-AOP product
site = "HARV", # Site code
year = "2019", # Year
check.size = TRUE,
easting = coords_HARV_001[1],
northing = coords_HARV_001[2], # Coordinates UTM
savepath = "data/NEON", # Path
token = NA) When you are downloading the Hyperspectral imagery {R} will ask you if you want to download the data please in your console type y.
We are now ready for start using hyperspectral data. Note that the downloaded data is in a HDF5 format, this format natively compresses data stored within it (i.e., makes it smaller) and supports data slicing, in other words, you can extract only the portions of the data that you need to work with rather than reading the entire dataset into your computer memory.
The downloaded hyperspectral data is stored within a folder DP3.30006.001 that in turn is stored within the folder scheme Data/Neon. Please pay attention to the full folder scheme. This scheme is composed of several folders with a file NEON_D01_HARV_DP3_725000_4700000_reflectance.h5 in HDF5 format. In order to load and work with this data you might need to specify the full path to that file.
f <- "data/NEON/DP3.30006.001/neon-aop-products/2019/FullSite/D01/2019_HARV_6/L3/Spectrometer/Reflectance/NEON_D01_HARV_DP3_725000_4700000_reflectance.h5"Note that you did not loaded all hyperspectral data into R but the path to the file, so using the path we can just call the metadata of the downloaded data.
When we are exploring the data structure in a HDF5 file, we really need to pay attention to the first two columns, these two columns is informing us about the location of the data (group) and the name of the data (name) stored into the file. For example the Map_info dataset is located in /HARV/Reflectance/Metadata/Coordinate_System and the Reflectance dataset under /HARV.
The wavelength dataset contains the middle wavelength values for each band in the dataset and the Reflectance dataset contains the image data, these two datasets are going to be used for both data processing and visualization.
View(h5ls(f, all = TRUE))A band represents a group of wavelengths. For example, the wavelength values between 695 nm and 700 nm might be one band as captured by an imaging spectrometer. The imaging spectrometer collects reflected light energy in a pixel for light in that band. When we are working with a multispectral (e.g., Landsat, MODIS) or hyperspectral (e.g., NEON-AOP - NASA/JPL AVIRIS-NG) dataset, the band information is reported as the center wavelength value. This value represents the center point value of the wavelengths represented in that band. Thus in a band spanning 695-700 nm, the center would be 697.5 nm.
# Get information about the wavelengths of HARV plot 001
wlInfo <- h5readAttributes(f, "/HARV/Reflectance/Metadata/Spectral_Data/Wavelength")
wlInfo$Description
[1] "Central wavelength of the reflectance bands."
$Units
[1] "nanometers"
To read the wavelenghts of our hyperspectral image we just use the function “h5read”
WL <- h5read(f, "/HARV/Reflectance/Metadata/Spectral_Data/Wavelength")
head(WL)[1] 383.7655 388.7733 393.7812 398.7890 403.7968 408.8047
tail(WL)[1] 2487.058 2492.066 2497.073 2502.081 2507.089 2512.097
To extract the full metadata of the hyperspectral image we use the function “h5readAttributes”. Note: please, read the content of the the metadata carefully.
reflInfo <- h5readAttributes(f, "/HARV/Reflectance/Reflectance_Data")
reflInfo$Cloud_conditions
[1] "For cloud conditions information see Weather Quality Index dataset."
$Cloud_type
[1] "Cloud type may have been selected from multiple flight trajectories."
$Data_Ignore_Value
[1] -9999
$Description
[1] "Atmospherically corrected reflectance."
$Dimension_Labels
[1] "Line, Sample, Wavelength"
$Dimensions
[1] 1000 1000 426
$Interleave
[1] "BSQ"
$Scale_Factor
[1] 10000
$Spatial_Extent_meters
[1] 725000 726000 4700000 4701000
$Spatial_Resolution_X_Y
[1] 1 1
$Units
[1] "Unitless."
$Units_Valid_range
[1] 0 10000
For example, the center wavelength value associated with the band 34 is 549.0242
WL[34][1] 549.0242
Importnatly, the “h5read” function reads data in the order: Bands, Cols, Rows. Let’s get these information from the reflectance metadata.
nRows <- reflInfo$Dimensions[1]
nCols <- reflInfo$Dimensions[2]
nBands <- reflInfo$Dimensions[3]
nRows[1] 1000
nCols[1] 1000
nBands[1] 426
Now using the the information obtained from the reflectance metadata let’s extract the band 34. You can try other band if you prefer.
b34 <- h5read(f, "/HARV/Reflectance/Reflectance_Data",
index = list(34, 1:nCols, 1:nRows)) # get band 34
# what type of object is b34?
class(b34)[1] "array"
## [1] "array"The returned object is an array, the arrays are matrices with more than 2 dimensions, i.e., are matrices stacked or piled in a single object.
# convert from array to matrix by selecting only the first band
b34 <- b34[1,, ]
# check it
class(b34)[1] "matrix" "array"
Now we can plot the image of band 34.
image(b34)The previous image is hard to visually interpret, let’s log the data and see what happens.
image(log(b34))An image data in raster format will often contain a data ignore value and a scale factor. The data ignore value represents pixels where there are no data. Usually, no data values may be attributed to the sensor not collecting data in the area of the image or to processing results which yield null values.
From the reflectance metadata we can define the ignore value as -9999. Thus, let’s set all pixels with a value == -9999 to NA (no value).
# there is NO data value in our raster - let's define it
myNoDataValue <- as.numeric(reflInfo$Data_Ignore_Value)
myNoDataValue[1] -9999
# set all values equal to -9999 to NA
b34[b34 == myNoDataValue] <- NA
# plot the image now
image(b34)In order to get a raster file suitable for further analysis, we first need to define the Coordinate reference system (CRS) of the raster. Again, we obtain the necessary information form the HDF5 file.
# Extract the EPSG from the h5 dataset
myEPSG <- h5read(f, "/HARV/Reflectance/Metadata/Coordinate_System/EPSG Code")
myEPSG[1] "32618"
# convert the EPSG code to a CRS string
myCRS <- crs(paste0("+init=epsg:", myEPSG))
myCRS[1] "PROJCRS[\"WGS 84 / UTM zone 18N\",\n BASEGEOGCRS[\"WGS 84\",\n ENSEMBLE[\"World Geodetic System 1984 ensemble\",\n MEMBER[\"World Geodetic System 1984 (Transit)\"],\n MEMBER[\"World Geodetic System 1984 (G730)\"],\n MEMBER[\"World Geodetic System 1984 (G873)\"],\n MEMBER[\"World Geodetic System 1984 (G1150)\"],\n MEMBER[\"World Geodetic System 1984 (G1674)\"],\n MEMBER[\"World Geodetic System 1984 (G1762)\"],\n MEMBER[\"World Geodetic System 1984 (G2139)\"],\n ELLIPSOID[\"WGS 84\",6378137,298.257223563,\n LENGTHUNIT[\"metre\",1]],\n ENSEMBLEACCURACY[2.0]],\n PRIMEM[\"Greenwich\",0,\n ANGLEUNIT[\"degree\",0.0174532925199433]],\n ID[\"EPSG\",4326]],\n CONVERSION[\"UTM zone 18N\",\n METHOD[\"Transverse Mercator\",\n ID[\"EPSG\",9807]],\n PARAMETER[\"Latitude of natural origin\",0,\n ANGLEUNIT[\"degree\",0.0174532925199433],\n ID[\"EPSG\",8801]],\n PARAMETER[\"Longitude of natural origin\",-75,\n ANGLEUNIT[\"degree\",0.0174532925199433],\n ID[\"EPSG\",8802]],\n PARAMETER[\"Scale factor at natural origin\",0.9996,\n SCALEUNIT[\"unity\",1],\n ID[\"EPSG\",8805]],\n PARAMETER[\"False easting\",500000,\n LENGTHUNIT[\"metre\",1],\n ID[\"EPSG\",8806]],\n PARAMETER[\"False northing\",0,\n LENGTHUNIT[\"metre\",1],\n ID[\"EPSG\",8807]],\n ID[\"EPSG\",16018]],\n CS[Cartesian,2],\n AXIS[\"(E)\",east,\n ORDER[1],\n LENGTHUNIT[\"metre\",1,\n ID[\"EPSG\",9001]]],\n AXIS[\"(N)\",north,\n ORDER[2],\n LENGTHUNIT[\"metre\",1,\n ID[\"EPSG\",9001]]],\n USAGE[\n SCOPE[\"unknown\"],\n AREA[\"Between 78°W and 72°W, northern hemisphere between equator and 84°N, onshore and offshore. Bahamas. Canada - Nunavut; Ontario; Quebec. Colombia. Cuba. Ecuador. Greenland. Haiti. Jamaica. Panama. Turks and Caicos Islands. United States (USA). Venezuela.\"],\n BBOX[0,-78,84,-72]]]"
Define final raster with projection info.
b34ras <- rast(b34, crs = myCRS)
# view the raster attributes
b34rasclass : SpatRaster
dimensions : 1000, 1000, 1 (nrow, ncol, nlyr)
resolution : 1, 1 (x, y)
extent : 0, 1000, 0, 1000 (xmin, xmax, ymin, ymax)
coord. ref. : WGS 84 / UTM zone 18N
source(s) : memory
name : lyr.1
min value : 0
max value : 2839
Let’s take a look at the georeferenced raster. Take note of the coordinates on the x and y axis.
image(log(b34ras),
xlab = "UTM Easting",
ylab = "UTM Northing",
main = "Properly Oriented Raster")Next we define the extents of our raster. The extents will be used to calculate the raster’s resolution. We get this information from the reflectance information obtained in a previous step.
# Grab the UTM coordinates of the spatial extent
xMin <- reflInfo$Spatial_Extent_meters[1]
xMax <- reflInfo$Spatial_Extent_meters[2]
yMin <- reflInfo$Spatial_Extent_meters[3]
yMax <- reflInfo$Spatial_Extent_meters[4]Define the spatial extent of the raster image
# define the extent (left, right, top, bottom)
rasExt <- ext(xMin, xMax, yMin, yMax)
# view the extent to make sure that it looks right
rasExt SpatExtent : 725000, 726000, 4700000, 4701000 (xmin, xmax, ymin, ymax)
Assign the spatial extent to the raster
ext(b34ras) <- rasExt
# look at raster attributes
b34rasclass : SpatRaster
dimensions : 1000, 1000, 1 (nrow, ncol, nlyr)
resolution : 1, 1 (x, y)
extent : 725000, 726000, 4700000, 4701000 (xmin, xmax, ymin, ymax)
coord. ref. : WGS 84 / UTM zone 18N
source(s) : memory
name : lyr.1
min value : 0
max value : 2839
Visualize the raster file but now let’s change the colors by adjusting the zlims.
col <- terrain.colors(25)
image(b34ras,
xlab = "UTM Easting",
ylab = "UTM Northing",
main = "Raster with Custom Colors",
col = col,
zlim = c(0, 1000))We can now save the created raster.
# Create a new folder to store the results
dir.create("output")Warning in dir.create("output"): 'output' already exists
# write out the raster as a geotiff
writeRaster(b34ras,
file = "output/HARV_plot_001_band_34.tif",
overwrite = TRUE)In the previous step we created a raster file of a single band, but each hyperspectral data from the NEON-AOP contains 426 bands. In this step we will construct a raster stack file, i.e., a raster with N bands, in other words, a raster of rasters. You can do it manually as we did for the band 34, but let’s take advantage of R and write a function that do the work for us. Please, take your time to read and understand the function.
# file: the hdf file
# band: the band you want to process
# noDataValue: values to be omitted
# extent: raster extent
# CRS: coordinates system
# returns: a matrix containing the reflectance data for the specific band
band2Raster <- function(file, band, noDataValue, extent, CRS){
# first, read in the raster
out <- h5read(file, "/HARV/Reflectance/Reflectance_Data",
index = list(band, NULL, NULL)) # path to the HDF5 file
# Convert from array to matrix
out <- (out[1,, ]) # output
# transpose data to fix flipped row and column order
# depending upon how your data are formatted you might not have to perform this
# step.
out <- t(out)
# assign data ignore values to NA
# note, you might chose to assign values of 15000 to NA
out[out == myNoDataValue] <- NA
# turn the out object into a raster
outr <- rast(out, crs = CRS)
# assign the extents to the raster
ext(outr) <- extent
# return the raster object
return(outr)
}Now apply the function to create a RGB raster file, to do this, we will use the bands 58, 34 and 19, respectively.
# create a list of the bands we want in our stack
rgb <- list(58, 34, 19)
# lapply tells R to apply the function to each element in the list
rgb_harv <- lapply(rgb, # band numbers
FUN = band2Raster, # function
file = f, # file from where data is extracted
noDataValue = myNoDataValue, # No data
extent = rasExt, # extent
CRS = myCRS) # Coordinate system
# check out the properties or rgb_rast
# note that it displays properties of 3 rasters.
rgb_harv[[1]]
class : SpatRaster
dimensions : 1000, 1000, 1 (nrow, ncol, nlyr)
resolution : 1, 1 (x, y)
extent : 725000, 726000, 4700000, 4701000 (xmin, xmax, ymin, ymax)
coord. ref. : WGS 84 / UTM zone 18N
source(s) : memory
name : lyr.1
min value : 0
max value : 3381
[[2]]
class : SpatRaster
dimensions : 1000, 1000, 1 (nrow, ncol, nlyr)
resolution : 1, 1 (x, y)
extent : 725000, 726000, 4700000, 4701000 (xmin, xmax, ymin, ymax)
coord. ref. : WGS 84 / UTM zone 18N
source(s) : memory
name : lyr.1
min value : 0
max value : 2839
[[3]]
class : SpatRaster
dimensions : 1000, 1000, 1 (nrow, ncol, nlyr)
resolution : 1, 1 (x, y)
extent : 725000, 726000, 4700000, 4701000 (xmin, xmax, ymin, ymax)
coord. ref. : WGS 84 / UTM zone 18N
source(s) : memory
name : lyr.1
min value : 0
max value : 2097
Success!!! 🎉🎉🎉 we created a list of three rasters. Now in order to get the raster stack just apply the function stack() from the package {raster}.
# Create a raster stack from our list of rasters
rgb_harv_stack <- rast(rgb_harv)
rgb_harv_stackclass : SpatRaster
dimensions : 1000, 1000, 3 (nrow, ncol, nlyr)
resolution : 1, 1 (x, y)
extent : 725000, 726000, 4700000, 4701000 (xmin, xmax, ymin, ymax)
coord. ref. : WGS 84 / UTM zone 18N
source(s) : memory
names : lyr.1, lyr.1, lyr.1
min values : 0, 0, 0
max values : 3381, 2839, 2097
As a final step, let’s assign the names of the bands to the raster stack object.
# Create a list of band names
bandNames <- paste("Band_", unlist(rgb), sep = "")
# set the rasterStack's names equal to the list of bandNames created above
names(rgb_harv_stack) <- bandNames
# check properties of the raster list - note the band names
rgb_harv_stackclass : SpatRaster
dimensions : 1000, 1000, 3 (nrow, ncol, nlyr)
resolution : 1, 1 (x, y)
extent : 725000, 726000, 4700000, 4701000 (xmin, xmax, ymin, ymax)
coord. ref. : WGS 84 / UTM zone 18N
source(s) : memory
names : Band_58, Band_34, Band_19
min values : 0, 0, 0
max values : 3381, 2839, 2097
# scale the data as specified in the reflInfo$Scale Factor
rgb_harv_stack <- rgb_harv_stack/as.integer(reflInfo$Scale_Factor)
# plot one raster in the stack to make sure things look OK.
plot(rgb_harv_stack$Band_58, main = "Band 58")And plot resulting RGB raster.
# create a 3 band RGB image
plotRGB(rgb_harv_stack, # rasterstack
r = 1, # red band
g = 2, # green band
b = 3, # blue band
stretch = "lin")Cool, right?
As we did with the band 34, let’s save the RGB raster in a GeoTIFF format.
# write out final raster
writeRaster(rgb_harv_stack,
file = "output/HARV_plot_001_RGB.tif",
overwrite = TRUE)Now, we will calculate vegetation indices, specifically, we will calculate the Normalized Difference Vegetation Index (NDVI).
The \(NDVI\) is computed as the difference between near-infrared (\(NIR\)) and red (\(RED\)) reflectance divided by their sum.:
\[NDVI = \frac{NIR - R}{NIR + R}\]
To calculate the NDVI from the NEON-AOP hyperspectral data, first select the bands 58 (red) and 90 (NIR) and then create a raster stack as we did for the RGB raster.
# select bands to use in calculation (red, NIR)
ndvi_bands <- c(58, 90) #bands c(58, 90) in full NEON hyperspectral dataset
# create raster list and then a stack using those two bands
ndvi_harv <- lapply(ndvi_bands,
FUN = band2Raster,
file = f,
noDataValue = myNoDataValue,
extent = rasExt,
CRS = myCRS)
ndvi_harv <- rast(ndvi_harv)
# make the names pretty
bandNDVINames <- paste("Band_", unlist(ndvi_bands), sep = "")
names(ndvi_harv) <- bandNDVINames
# view the properties of the new raster stack
ndvi_harvclass : SpatRaster
dimensions : 1000, 1000, 2 (nrow, ncol, nlyr)
resolution : 1, 1 (x, y)
extent : 725000, 726000, 4700000, 4701000 (xmin, xmax, ymin, ymax)
coord. ref. : WGS 84 / UTM zone 18N
source(s) : memory
names : Band_58, Band_90
min values : 0, 0
max values : 3381, 8444
Write a function for NDVI calculation.
#calculate NDVI
NDVI_func <- function(ras) {
(ras[, 2] - ras[, 1])/(ras[, 2] + ras[, 1])
}Apply the function and plot the result.
ndvi_calc <- app(ndvi_harv,
fun = NDVI_func)
plot(ndvi_calc,
main = "NDVI for the NEON HARV Field Site")Now, play with breaks and colors to create a meaningful map add a color map with 4 colors.
myCol <- rev(terrain.colors(4)) # use the 'rev()' function to put green as the highest NDVI value
# add breaks to the colormap, including lowest and highest values (4 breaks = 3 segments)
brk <- c(0, .25, .5, .75, 1)
# plot the image using breaks
plot(ndvi_calc,
main = "NDVI for the NEON HARV Field Site",
col = myCol,
breaks = brk)We can save the resulting NDVI raster.
writeRaster(ndvi_calc,
file = "output/HARV_plot_001_NDVI.tif",
overwrite = TRUE)Now we will extract all reflectance values for a selected pixel in the hyperspectral data from HARV and use this values to plot its spectral signatures. For practice purpose we will select a pixel at the position (100, 35), in other words, we are selecting the pixel at the row 100 and column 35. In addition to get the the reflectance for all bands we need to inform an empty space (as NULL).
# extract all bands from a single pixel
aPixel <- h5read(f, "/HARV/Reflectance/Reflectance_Data",
index = list(NULL, 100, 35))
class(aPixel)[1] "array"
# The line above generates a vector of reflectance values.
# Next, we reshape the data and turn them into a dataframe
b <- adply(aPixel, c(1))
class(b)[1] "data.frame"
# create clean data frame
aPixeldf <- b[2]
# add wavelength data to matrix
aPixeldf$Wavelength <- WL
head(aPixeldf) V1 Wavelength
1 328 383.7655
2 279 388.7733
3 257 393.7812
4 234 398.7890
5 227 403.7968
6 244 408.8047
Now select the scale factor from the reflectance object and scale the reflectance values for all bands.
scaleFact <- reflInfo$Scale_Factor
# add scaled data column to the data frame
aPixeldf$scaled <- (aPixeldf$V1/as.vector(scaleFact))
# make nice column names
names(aPixeldf) <- c('Reflectance', 'Wavelength', 'ScaledReflectance')
head(aPixeldf) Reflectance Wavelength ScaledReflectance
1 328 383.7655 0.0328
2 279 388.7733 0.0279
3 257 393.7812 0.0257
4 234 398.7890 0.0234
5 227 403.7968 0.0227
6 244 408.8047 0.0244
tail(aPixeldf) Reflectance Wavelength ScaledReflectance
421 0 2487.058 0.0000
422 0 2492.066 0.0000
423 0 2497.073 0.0000
424 0 2502.081 0.0000
425 0 2507.089 0.0000
426 1043 2512.097 0.1043
As a last step let’s plot the scaled reflectance as a function of the wavelength.
aPixeldf %>%
ggplot(aes(x = Wavelength, y = ScaledReflectance)) +
geom_line() +
xlab("Wavelength (nm)") +
ylab("Reflectance") +
theme_bw()In the previous step we selected and arbitrary pixel at (100, 35), however when we are working with real data we might need to have the spatial position of objects on the ground (e.g., GPS points).
We will use the exact geographical position of tree species in the HARV site. To do that, we will need to install two more packages that are not in CRAN but on GitHub. The packages are geoNEON and neonhs
First install and call the package geoNEON
# install the package {geoNEON}
#remotes::install_github('NEONScience/NEON-geolocation/geoNEON', dependencies = TRUE)
if ( ! ("rhdf5" %in% installed.packages()))
{remotes::install_github('NEONScience/NEON-geolocation/geoNEON',
dependencies = TRUE)}
library(geoNEON)Now download the data from woody plant vegetation from NEON.
# Download Woody plant vegetation structure from NEON #####
zipsByProduct(
dpID = "DP1.10098.001",
site = "HARV",
savepath = "data/NEON",
check.size = FALSE
)Finding available files
|
| | 0%
|
|== | 2%
|
|=== | 5%
|
|===== | 7%
|
|====== | 9%
|
|======== | 11%
|
|========== | 14%
|
|=========== | 16%
|
|============= | 18%
|
|============== | 20%
|
|================ | 23%
|
|================== | 25%
|
|=================== | 27%
|
|===================== | 30%
|
|====================== | 32%
|
|======================== | 34%
|
|========================= | 36%
|
|=========================== | 39%
|
|============================= | 41%
|
|============================== | 43%
|
|================================ | 45%
|
|================================= | 48%
|
|=================================== | 50%
|
|===================================== | 52%
|
|====================================== | 55%
|
|======================================== | 57%
|
|========================================= | 59%
|
|=========================================== | 61%
|
|============================================= | 64%
|
|============================================== | 66%
|
|================================================ | 68%
|
|================================================= | 70%
|
|=================================================== | 73%
|
|==================================================== | 75%
|
|====================================================== | 77%
|
|======================================================== | 80%
|
|========================================================= | 82%
|
|=========================================================== | 84%
|
|============================================================ | 86%
|
|============================================================== | 89%
|
|================================================================ | 91%
|
|================================================================= | 93%
|
|=================================================================== | 95%
|
|==================================================================== | 98%
|
|======================================================================| 100%
Downloading files totaling approximately 118.120911 MB
data/NEON/filesToStack10098 already exists. Download will proceed, but check for duplicate files.
Downloading 44 files
|
| | 0%
|
|== | 2%
|
|=== | 5%
|
|===== | 7%
|
|======= | 9%
|
|======== | 12%
|
|========== | 14%
|
|=========== | 16%
|
|============= | 19%
|
|=============== | 21%
|
|================ | 23%
|
|================== | 26%
|
|==================== | 28%
|
|===================== | 30%
|
|======================= | 33%
|
|======================== | 35%
|
|========================== | 37%
|
|============================ | 40%
|
|============================= | 42%
|
|=============================== | 44%
|
|================================= | 47%
|
|================================== | 49%
|
|==================================== | 51%
|
|===================================== | 53%
|
|======================================= | 56%
|
|========================================= | 58%
|
|========================================== | 60%
|
|============================================ | 63%
|
|============================================== | 65%
|
|=============================================== | 67%
|
|================================================= | 70%
|
|================================================== | 72%
|
|==================================================== | 74%
|
|====================================================== | 77%
|
|======================================================= | 79%
|
|========================================================= | 81%
|
|=========================================================== | 84%
|
|============================================================ | 86%
|
|============================================================== | 88%
|
|=============================================================== | 91%
|
|================================================================= | 93%
|
|=================================================================== | 95%
|
|==================================================================== | 98%
|
|======================================================================| 100%
44 files successfully downloaded to data/NEON/filesToStack10098
## Combine the files
stackByTable("data/NEON/filesToStack10098", folder = TRUE)Unpacking zip files using 1 cores.
Stacking operation across a single core.
Stacking table vst_apparentindividual
Stacking table vst_mappingandtagging
Stacking table vst_perplotperyear
Stacking table vst_non-woody
Stacking table vst_shrubgroup
Copied the most recent publication of validation file to /stackedFiles
Copied the most recent publication of categoricalCodes file to /stackedFiles
Copied the most recent publication of variable definition file to /stackedFiles
Finished: Stacked 5 data tables and 4 metadata tables!
Stacking took 1.279827 secs
All unzipped monthly data folders have been removed.
The function “getLocTOS” will refine the geolocation data associated with NEON data products.
# Calculate the more precise location for each NEON plot in the HARV site
vegmap <- "data/NEON/filesToStack10098/stackedFiles/vst_mappingandtagging.csv" %>%
read_csv() %>%
mutate(year = substr(date, 1, 4)) %>%
filter(year == '2019') %>%
getLocTOS("vst_mappingandtagging") # Calculate more precise geolocations for specific NEON data productsWarning: One or more parsing issues, call `problems()` on your data frame for details,
e.g.:
dat <- vroom(...)
problems(dat)
Rows: 8503 Columns: 31
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (22): uid, namedLocation, eventID, domainID, siteID, plotID, recordType...
dbl (3): pointID, stemDistance, stemAzimuth
lgl (4): identificationHistoryID, morphospeciesID, morphospeciesIDRemarks,...
dttm (1): publicationDate
date (1): date
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
|
| | 0%
|
|= | 2%
|
|== | 3%
|
|=== | 5%
|
|===== | 6%
|
|====== | 8%
|
|======= | 10%
|
|======== | 11%
|
|========= | 13%
|
|========== | 15%
|
|=========== | 16%
|
|============ | 18%
|
|============== | 19%
|
|=============== | 21%
|
|================ | 23%
|
|================= | 24%
|
|================== | 26%
|
|=================== | 27%
|
|==================== | 29%
|
|===================== | 31%
|
|======================= | 32%
|
|======================== | 34%
|
|========================= | 35%
|
|========================== | 37%
|
|=========================== | 39%
|
|============================ | 40%
|
|============================= | 42%
|
|============================== | 44%
|
|================================ | 45%
|
|================================= | 47%
|
|================================== | 48%
|
|=================================== | 50%
|
|==================================== | 52%
|
|===================================== | 53%
|
|====================================== | 55%
|
|======================================== | 56%
|
|========================================= | 58%
|
|========================================== | 60%
|
|=========================================== | 61%
|
|============================================ | 63%
|
|============================================= | 65%
|
|============================================== | 66%
|
|=============================================== | 68%
|
|================================================= | 69%
|
|================================================== | 71%
|
|=================================================== | 73%
|
|==================================================== | 74%
|
|===================================================== | 76%
|
|====================================================== | 77%
|
|======================================================= | 79%
|
|======================================================== | 81%
The following namedLocation was not found: HARV_022.basePlot.vst.42
|
|========================================================== | 82%
|
|=========================================================== | 84%
|
|============================================================ | 85%
|
|============================================================= | 87%
|
|============================================================== | 89%
|
|=============================================================== | 90%
|
|================================================================ | 92%
|
|================================================================= | 94%
|
|=================================================================== | 95%
|
|==================================================================== | 97%
|
|===================================================================== | 98%
|
|======================================================================| 100%
# Load individual tree coordinates
vegind <- read_csv("data/NEON/filesToStack10098/stackedFiles/vst_apparentindividual.csv")Rows: 24925 Columns: 43
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (20): uid, namedLocation, eventID, domainID, siteID, plotID, subplotID,...
dbl (13): tempStemID, stemDiameter, measurementHeight, height, baseCrownHei...
lgl (8): dendrometerInstallationDate, initialGapMeasurementDate, initialBa...
dttm (1): publicationDate
date (1): date
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Combine the coordinates with three identification
vegetation <- right_join(vegind, vegmap,
by = c("individualID", "namedLocation", "domainID", "siteID", "plotID")) %>%
filter(!is.na(adjEasting), !is.na(adjNorthing),
plantStatus == "Live")Now select the individual trees available for the plot HARV_001 and transform it into a spatial object.
harv_01_trees <- vegetation %>%
select(adjNorthing, adjEasting, scientificName, plotID,
adjDecimalLatitude, adjDecimalLongitude) %>%
filter(plotID == "HARV_001")SF format
# SF format
harv_01_trees_sf <- st_as_sf(harv_01_trees,
coords = c("adjEasting", "adjNorthing"),
crs = crs(ndvi_calc)) SP format
# SPT format
harv_01_trees_spt <- harv_01_trees
coordinates(harv_01_trees_spt) <- ~ adjEasting + adjNorthing
crs(harv_01_trees_spt) <- crs(ndvi_calc)The plot HARV_001 is composed by four species and 38 individuals. You can verify that information.
unique(harv_01_trees$scientificName)[1] "Quercus rubra L." "Acer rubrum L." "Pinus strobus L." "Betula lenta L."
length(harv_01_trees$scientificName)[1] 51
Plot the results
plot(ndvi_harv[[1]])
plot(harv_01_trees_sf, add = TRUE)Warning in plot.sf(harv_01_trees_sf, add = TRUE): ignoring all but the first
attribute
This is not over, now we will extract the spectral information for each of our trees in the plot HARV_001. For this we will use the R package {neonhs}
First install and call the package neonhs
#remotes::install_github('earthlab/neonhs')
if ( ! ("rhdf5" %in% installed.packages()))
{remotes::install_github('earthlab/neonhs')}
library(neonhs)Extract the spectra data associated with each tree species.
# Path to access the hyperspectral data
hs_path_2019 <- list.files(
path = "data/NEON/DP3.30006.001/neon-aop-products/2019/",
pattern = "reflectance.h5",
recursive = TRUE, full.names = TRUE
)
# extract the spectra data
resHARV_001 <- neonhs::hs_extract_pts(filename = hs_path_2019, # path to the h5 file
pts = harv_01_trees_spt, # spatial points
bands = 1:426) # which bandsWarning in CPL_crs_from_input(x): GDAL Message 1: +init=epsg:XXXX syntax is
deprecated. It might return a CRS with a non-EPSG compliant axis order.
resHARV_001class : SpatialPointsDataFrame
features : 51
extent : 725546.2, 725561.7, 4700519, 4700534 (xmin, xmax, ymin, ymax)
crs : +proj=utm +zone=18 +datum=WGS84 +units=m +no_defs
variables : 432
names : scientificName, plotID, adjDecimalLatitude, adjDecimalLongitude, band1_384nm, band2_389nm, band3_394nm, band4_399nm, band5_404nm, band6_409nm, band7_414nm, band8_419nm, band9_424nm, band10_429nm, band11_434nm, ...
min values : Acer rubrum L., HARV_001, 42.4242049282616, -72.2584001428777, 0, 0, 6e-04, 0.0048, 0.0059, 0.0044, 0.006, 0.0054, 0.0041, 0.0062, 0.0058, ...
max values : Quercus rubra L., HARV_001, 42.4243388123685, -72.2582130792827, 0.0272, 0.0248, 0.0245, 0.023, 0.0235, 0.0257, 0.0215, 0.0233, 0.0225, 0.0247, 0.0266, ...
The object with the spectral data is a SpatialPointsDataFrame we need to transform it into a data.frame to continue working with the spectra.
resHARV_001_df <- as.data.frame(resHARV_001) %>%
bind_rows() %>%
tibble() %>%
dplyr::select(!c("band418_2472nm", "band419_2477nm", "band420_2482nm", "band421_2487nm",
"band422_2492nm", "band423_2497nm", "band424_2502nm", "band425_2507nm",
"band426_2512nm", "adjEasting.1", "adjNorthing.1")) %>%
dplyr::select(plotID, scientificName, adjNorthing, adjEasting, adjDecimalLongitude, adjDecimalLatitude, everything())
resHARV_001_df# A tibble: 51 × 423
plotID scientificName adjNorthing adjEasting adjDecimalLongitude
<chr> <chr> <dbl> <dbl> <dbl>
1 HARV_001 Quercus rubra L. 4700521. 725555. -72.3
2 HARV_001 Acer rubrum L. 4700519. 725555. -72.3
3 HARV_001 Acer rubrum L. 4700521. 725555. -72.3
4 HARV_001 Pinus strobus L. 4700532. 725546. -72.3
5 HARV_001 Quercus rubra L. 4700526. 725547. -72.3
6 HARV_001 Quercus rubra L. 4700533. 725547. -72.3
7 HARV_001 Pinus strobus L. 4700519. 725546. -72.3
8 HARV_001 Quercus rubra L. 4700523. 725562. -72.3
9 HARV_001 Acer rubrum L. 4700525. 725547. -72.3
10 HARV_001 Pinus strobus L. 4700527. 725559. -72.3
# ℹ 41 more rows
# ℹ 418 more variables: adjDecimalLatitude <dbl>, band1_384nm <dbl>,
# band2_389nm <dbl>, band3_394nm <dbl>, band4_399nm <dbl>, band5_404nm <dbl>,
# band6_409nm <dbl>, band7_414nm <dbl>, band8_419nm <dbl>, band9_424nm <dbl>,
# band10_429nm <dbl>, band11_434nm <dbl>, band12_439nm <dbl>,
# band13_444nm <dbl>, band14_449nm <dbl>, band15_454nm <dbl>,
# band16_459nm <dbl>, band17_464nm <dbl>, band18_469nm <dbl>, …
Let’s perform a bit of cleaning…
resHARV_001_df_long <- resHARV_001_df %>%
dplyr::select(!c(plotID, adjNorthing, adjEasting, adjDecimalLongitude, adjDecimalLatitude)) %>%
reshape2::melt(id.vars = "scientificName",
variable.name = "Wavelength",
value.name = "Reflectance")
resHARV_001_df_long <- resHARV_001_df_long %>%
mutate(Wavelength2 = Wavelength) %>%
separate(Wavelength2, into = c("bands", "wl"), sep = "_") %>%
mutate(WL = as.numeric(gsub("[nm]", "", wl))) %>%
mutate(scientificName = as.factor(scientificName))Now let’s plot the results…
# Aux function for visualization
theme_lab6 <- function() {
theme_bw() +
theme(panel.grid.minor = element_blank(),
plot.background = element_rect(fill = "white", color = NA),
strip.background = element_rect(fill = "grey80", color = NA),
legend.title = element_text(face = "bold", size = 15),
legend.text = element_text(size = 12))
}
# plot the spectra by species
resHARV_001_df_long %>%
drop_na() %>%
ggplot() +
geom_line(aes(x = WL, y = Reflectance, color = scientificName)) +
scale_color_viridis_d(option = "A",
labels = c("Acer rubrum",
"Betula lenta",
"Pinus strobus",
"Quercus rubra")) +
xlab("Wavelength (nm)") +
ylab("Reflectance") +
theme_lab6()Cool! we just made a plot with the spectral signatures for four tree species at HARV site, but there are some anomalies in the plot which difficult its interpretation. Those anomalies around 1400 nm and 1850 nm correspond to two major atmospheric absorption bands, i.e., regions in the spectra where gasses in the atmosphere (primarily carbon dioxide and water vapor) absorb radiation, and therefore, obscure the reflected radiation that the imaging spectrometer measures.
To eliminate those anomalies we first might need to select and eliminate those bands manually. Happily, the reflectance metadata contains the lower and upper bound of each of those atmospheric absorption bands. Let’s read those bands and plot rectangles where the reflectance measurements are obscured by atmospheric absorbtion.
# grab Reflectance metadata (which contains absorption band limits)
reflMetadata <- h5readAttributes(f, "/HARV/Reflectance" )
ab1 <- reflMetadata$Band_Window_1_Nanometers
ab2 <- reflMetadata$Band_Window_2_Nanometers
ab1[1] 1340 1445
ab2[1] 1790 1955
Plot spectral signatures again with rectangles showing the absorption bands
resHARV_001_df_long %>%
drop_na() %>%
ggplot() +
geom_line(aes(x = WL, y = Reflectance, color = scientificName)) +
scale_color_viridis_d(option = "A",
labels = c("Acer rubrum",
"Betula lenta",
"Pinus strobus",
"Quercus rubra")) +
geom_rect(mapping = aes(ymin = min(Reflectance),
ymax = max(Reflectance),
xmin = ab1[1], xmax = ab1[2]),
color = "darkgray", fill = "gray", alpha = 0.7) +
geom_rect(mapping = aes(ymin = min(Reflectance),
ymax = max(Reflectance),
xmin = ab2[1], xmax = ab2[2]),
color = "darkgray", fill = "gray", alpha = 0.7) +
xlab("Wavelength (nm)") +
ylab("Reflectance") +
theme_lab6()By inspecting the plot we can confirm that the sections of the spectra with anomalies are within the atmospheric absorption bands. Using the absorption band limits we can remove the sections with anomalies and plot masked spectral signatures for the four species.
# Duplicate the spectral signatures into a new data.frame
resHARV_001_df_long_mask <- resHARV_001_df_long
# Mask out all values within each of the two atmospheric absorbtion bands
resHARV_001_df_long_mask[resHARV_001_df_long_mask$WL >
ab1[1] & resHARV_001_df_long_mask$WL < ab1[2], ]$Reflectance <- NA
resHARV_001_df_long_mask[resHARV_001_df_long_mask$WL >
ab2[1] & resHARV_001_df_long_mask$WL < ab2[2], ]$Reflectance <- NA
head(resHARV_001_df_long_mask) scientificName Wavelength Reflectance bands wl WL
1 Quercus rubra L. band1_384nm 0.0114 band1 384nm 384
2 Acer rubrum L. band1_384nm 0.0009 band1 384nm 384
3 Acer rubrum L. band1_384nm 0.0000 band1 384nm 384
4 Pinus strobus L. band1_384nm 0.0082 band1 384nm 384
5 Quercus rubra L. band1_384nm 0.0000 band1 384nm 384
6 Quercus rubra L. band1_384nm 0.0028 band1 384nm 384
Plot the masked spectral signatures
resHARV_001_df_long_mask %>%
ggplot() +
geom_line(aes(x = WL, y = Reflectance, color = scientificName)) +
scale_color_viridis_d(option = "A",
labels = c("Acer rubrum",
"Betula lenta",
"Pinus strobus",
"Quercus rubra")) +
xlab("Wavelength (nm)") +
ylab("Reflectance") +
theme_lab6()It’s always good practice to close the H5 connection before moving on.
# close the H5 file
H5close()Clean your R environment.
rm(list = ls())That’s it!
This was a very long lab, and it will take sometime to digest all the information. Thus the challenge for this lab is simple. Prepare a document with all the figures generated in this tutorial.